37 research outputs found

    Information Extraction in an Optical Character Recognition Context

    In this dissertation, we investigate the effectiveness of information extraction in the presence of Optical Character Recognition (OCR). It is well known that OCR errors have no effect on general retrieval tasks, mainly because of the redundancy of information in textual documents. Our work shows that the information extraction task, by contrast, is significantly influenced by OCR errors. Intuitively, this is because extraction algorithms rely on a small window of text surrounding the objects to be extracted. We show that extraction methodologies based on Hidden Markov Models are not robust enough to deal with extraction in this noisy environment. We also show that both precise shallow parsing and fuzzy shallow parsing can be used to increase recall at the price of a significant drop in precision. Most of our experimental work deals with the extraction of dates of birth and of postal addresses. Both of these specific extractions are part of general methods for identifying privacy information in textual documents. Privacy information is particularly important when large collections of documents are posted on the Internet.

    Clima organizacional y calidad del servicio educativo en la I. E. Héroes del Cenepa N° 130 de Lima -2022

    The general purpose of this research is to establish the relationship between Organizational Climate and the Quality of Educational Service at I. E. Héroes del Cenepa N° 130 in Lima, 2022. The study is correlational in level, with a basic descriptive-correlational design and a quantitative approach. The total population was 60 teachers, and the sample was 52 teachers. Data on the Organizational Climate variable were collected using a survey technique and an Organizational Climate questionnaire developed by the researcher, with high Cronbach reliability. The survey technique was also used for the Quality of Service variable, with a Quality of Service questionnaire developed by the same researcher, also with high reliability. The data were processed with the SPSS statistical program. The results show a significant relationship between Organizational Climate and Quality of Service at I. E. Héroes del Cenepa, 2022, corroborated by Spearman's Rho correlation test (0.703) with p = 0.001.

    Evidence of reassortment of pandemic H1N1 influenza virus in swine in Argentina: are we facing the expansion of potential epicenters of influenza emergence?

    In this report, we describe the occurrence of two novel swine influenza viruses (SIVs) in pigs in Argentina. These viruses are the result of two independent reassortment events between the H1N1 pandemic influenza virus (H1N1pdm) and human-like SIVs, showing the constant evolution of influenza viruses at the human–swine interface and the potential health risk of H1N1pdm as it appears to be maintained in the swine population. It must be noted that, because of the lack of information regarding the circulation of SIVs in South America, we cannot rule out the possibility that ancestors of the H1N1pdm or other SIVs have been present in this part of the world. More importantly, these findings suggest an ever-expanding geographic range of potential epicenters of influenza emergence with public health risks. Facultad de Ciencias Veterinaria

    Why Are Outcomes Different for Registry Patients Enrolled Prospectively and Retrospectively? Insights from the Global Anticoagulant Registry in the FIELD-Atrial Fibrillation (GARFIELD-AF).

    Background: Retrospective and prospective observational studies are designed to reflect real-world evidence on clinical practice, but can yield conflicting results. The GARFIELD-AF Registry includes both methods of enrolment and allows analysis of differences in patient characteristics and outcomes that may result. Methods and Results: Patients with atrial fibrillation (AF) and ≥1 risk factor for stroke at diagnosis of AF were recruited either retrospectively (n = 5069) or prospectively (n = 5501) from 19 countries and then followed prospectively. The retrospectively enrolled cohort comprised patients with established AF (for at least 6, and up to 24, months before enrolment), who were identified retrospectively (baseline and partial follow-up data were collected from the emedical records) and then followed prospectively for 0-18 months (such that the total time of follow-up was 24 months; data collection between Dec-2009 and Oct-2010). In the prospectively enrolled cohort, patients with newly diagnosed AF (≤6 weeks after diagnosis) were recruited between Mar-2010 and Oct-2011 and were followed for 24 months after enrolment. Differences between the cohorts were observed in clinical characteristics, including type of AF, stroke prevention strategies, and event rates. More patients in the retrospectively identified cohort received vitamin K antagonists (62.1% vs. 53.2%) and fewer received non-vitamin K oral anticoagulants (1.8% vs. 4.2%). All-cause mortality rates per 100 person-years during the prospective follow-up (from the first study visit up to 1 year) were significantly lower in the retrospectively than in the prospectively identified cohort (3.04 [95% CI 2.51 to 3.67] vs. 4.05 [95% CI 3.53 to 4.63]; p = 0.016). Conclusions: Interpretations of data from registries that aim to evaluate the characteristics and outcomes of patients with AF must take account of differences in registry design and of the recall bias and survivorship bias incurred with retrospective enrolment. Clinical Trial Registration: URL: http://www.clinicaltrials.gov. Unique identifier for GARFIELD-AF: NCT01090362.

    Risk profiles and one-year outcomes of patients with newly diagnosed atrial fibrillation in India: Insights from the GARFIELD-AF Registry.

    BACKGROUND: The Global Anticoagulant Registry in the FIELD-Atrial Fibrillation (GARFIELD-AF) is an ongoing prospective noninterventional registry, which is providing important information on the baseline characteristics, treatment patterns, and 1-year outcomes in patients with newly diagnosed non-valvular atrial fibrillation (NVAF). This report describes data from Indian patients recruited in this registry. METHODS AND RESULTS: A total of 52,014 patients with newly diagnosed AF were enrolled globally; of these, 1388 patients were recruited from 26 sites within India (2012-2016). In India, the mean age was 65.8 years at diagnosis of NVAF. Hypertension was the most prevalent risk factor for AF, present in 68.5% of patients from India and in 76.3% of patients globally (P < 0.001). Diabetes and coronary artery disease (CAD) were prevalent in 36.2% and 28.1% of patients as compared with global prevalence of 22.2% and 21.6%, respectively (P < 0.001 for both). Antiplatelet therapy was the most common antithrombotic treatment in India. With increasing stroke risk, however, patients were more likely to receive oral anticoagulant therapy [mainly vitamin K antagonist (VKA)], but average international normalized ratio (INR) was lower among Indian patients [median INR value 1.6 (interquartile range {IQR}: 1.3-2.3) versus 2.3 (IQR 1.8-2.8) (P < 0.001)]. Compared with other countries, patients from India had markedly higher rates of all-cause mortality [7.68 per 100 person-years (95% confidence interval 6.32-9.35) vs 4.34 (4.16-4.53), P < 0.0001], while rates of stroke/systemic embolism and major bleeding were lower after 1 year of follow-up. CONCLUSION: Compared to previously published registries from India, the GARFIELD-AF registry describes clinical profiles and outcomes in Indian patients with AF of a different etiology. The registry data show that compared to the rest of the world, Indian AF patients are younger in age and have more diabetes and CAD. Patients with a higher stroke risk are more likely to receive anticoagulation therapy with VKA but are underdosed compared with the global average in the GARFIELD-AF. CLINICAL TRIAL REGISTRATION: URL: http://www.clinicaltrials.gov. Unique identifier: NCT01090362

    26th Annual Computational Neuroscience Meeting (CNS*2017): Part 3 - Meeting Abstracts - Antwerp, Belgium. 15–20 July 2017

    This work was produced as part of the activities of FAPESP Research, Disseminations and Innovation Center for Neuromathematics (grant 2013/07699-0, S. Paulo Research Foundation). NLK is supported by a FAPESP postdoctoral fellowship (grant 2016/03855-5). ACR is partially supported by a CNPq fellowship (grant 306251/2014-0).

    Fuzzy Information Extraction on OCR Text

    In this paper, we report on two experiments on the identification and extraction of Date of Birth instances. The objective of these experiments is to increase the recall level by increasing the edit distance while obtaining a reasonable precision.
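
The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the trigger phrase, the date pattern, and the names `edit_distance`, `find_dob`, and `max_dist` are all assumptions made for illustration. The idea is that a date is accepted only if the text just before it lies within a given edit distance of a trigger phrase, so raising the threshold tolerates more OCR damage (higher recall) but also admits more unrelated phrases (lower precision).

```python
import re

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

# Hypothetical trigger phrase and date pattern, chosen only for this sketch.
TRIGGER = "date of birth"
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")

def find_dob(text: str, max_dist: int) -> list[str]:
    """Return dates preceded by a phrase within max_dist edits of TRIGGER."""
    hits = []
    for m in DATE_RE.finditer(text):
        # Take the characters just before the date, dropping the separator.
        window = text[:m.start()].rstrip(" :").lower()[-len(TRIGGER):]
        if edit_distance(window, TRIGGER) <= max_dist:
            hits.append(m.group())
    return hits
```

With an OCR-damaged input such as `"Dale of Birlh: 03/12/1975"`, calling `find_dob` with `max_dist=0` misses the date, while `max_dist=2` recovers it; the same loosening would eventually accept phrases that are not date-of-birth labels at all.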

    Automatic Redaction of Private Information Using Relational Information Extraction

    We report on an attempt to build an automatic redaction system by applying information extraction techniques to the identification of private dates of birth. We conclude that automatic redaction is a promising concept, although information extraction is significantly affected by the presence of OCR errors.